Learning Deep Visual Object Models From Noisy Web Data: How to Make it Work

Massouh, Nizar; Babiloni, Francesca; Tommasi, Tatiana; Caputo, Barbara
2017

Abstract

Deep networks thrive when trained on large-scale data collections. This has given ImageNet a central role in the development of deep architectures for visual object classification. However, ImageNet was created during a specific period in time, and as such it is prone to aging as well as to dataset bias issues. Moving beyond fixed training datasets will lead to more robust visual systems, especially for robots deployed in new environments, which must learn from the objects they encounter there. To make this possible, it is important to break free from the need for manual annotators. Recent work has begun to investigate how the massive amount of images available on the Web can be used in place of manual image annotations. We contribute to this research thread with two findings: (1) a study correlating a given level of label noise with the expected drop in accuracy, for two deep architectures and two different types of noise, which clearly identifies GoogLeNet as a suitable architecture for learning from Web data; (2) a recipe for creating Web datasets with minimal noise and maximal visual variability, based on a visual and natural language processing concept expansion strategy. By combining these two results, we obtain a method for learning powerful deep object models automatically from the Web. We confirm the effectiveness of our approach through object categorization experiments using our Web-derived version of ImageNet, both on a popular robot vision benchmark database and on a lifelong object discovery task on a mobile robot.
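The first finding rests on injecting controlled amounts of label noise into a training set and measuring the resulting drop in test accuracy. A minimal illustrative sketch of that kind of corruption step is below, assuming a uniform noise model; the function name and parameters are hypothetical, not the authors' code.

```python
# Illustrative sketch of a controlled label-noise study (assumed uniform
# noise model; not the paper's actual code). A fixed fraction of labels is
# flipped to a random wrong class, simulating noisy Web annotations.
import random

def inject_uniform_label_noise(labels, num_classes, noise_level, seed=0):
    """Return a copy of `labels` with a `noise_level` fraction replaced
    by uniformly sampled wrong class labels."""
    rng = random.Random(seed)
    noisy = list(labels)
    flip_idx = rng.sample(range(len(noisy)), int(noise_level * len(noisy)))
    for i in flip_idx:
        wrong_classes = [c for c in range(num_classes) if c != noisy[i]]
        noisy[i] = rng.choice(wrong_classes)
    return noisy

# Example: corrupt 30% of a 10-class toy label set; one would then train
# an architecture such as GoogLeNet on the noisy labels and compare its
# test accuracy against training on the clean labels.
clean = [i % 10 for i in range(1000)]
noisy = inject_uniform_label_noise(clean, num_classes=10, noise_level=0.3)
print(sum(c != n for c, n in zip(clean, noisy)), "labels flipped")
```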
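For the second finding, the language-processing half of a concept expansion strategy can be pictured as querying a lexical hierarchy for more specific terms, which then become Web image-search queries with greater visual variety. The sketch below uses NLTK's WordNet interface as an assumed stand-in; the paper's actual pipeline, including its visual expansion step, may differ.

```python
# Hypothetical sketch of language-side concept expansion via WordNet
# hyponyms (assumes NLTK with the 'wordnet' corpus downloaded). The
# expanded terms would serve as more varied Web image-search queries.
from nltk.corpus import wordnet as wn

def expand_concept(seed, max_terms=10):
    """Collect lemma names of direct hyponyms of the seed noun's most
    common synset, e.g. 'dog' -> 'corgi', 'puppy', 'lapdog', ..."""
    terms = []
    for synset in wn.synsets(seed, pos=wn.NOUN)[:1]:  # most common sense
        for hyponym in synset.hyponyms():
            terms.extend(l.name().replace("_", " ") for l in hyponym.lemmas())
    return terms[:max_terms]

print(expand_concept("dog"))
```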
International Conference on Intelligent Robots and Systems (IROS)
robotics, computer vision, learning from the web, machine learning
04 Publication in conference proceedings::04b Conference paper in a volume
Learning Deep Visual Object Models From Noisy Web Data: How to Make it Work / Massouh, Nizar; Babiloni, Francesca; Tommasi, Tatiana; Young, Jay; Hawes, Nick; Caputo, Barbara. - ELECTRONIC. - 2017:(2017), pp. 5564-5571. (Paper presented at the International Conference on Intelligent Robots and Systems (IROS), held in Vancouver, Canada, 24-28 September 2017) [10.1109/IROS.2017.8206444].
Files attached to this record
File: Massouh_Learning-Deep-Visual_2017.pdf (access restricted to archive managers; contact the author)
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 1.07 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1092191
Citations
  • PubMed Central: not available
  • Scopus: 8
  • Web of Science: 3